Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restrict Python Version Mismatch between Pickled Object and Remote Envrionment #2848

Merged
merged 8 commits into from
Oct 30, 2024

Conversation

Mecoli1219
Copy link
Contributor

@Mecoli1219 Mecoli1219 commented Oct 21, 2024

Tracking Issues

flyteorg/flyte#5907

Why are the changes needed?

In the Jupyter Notebook environment, we are using cloudpickle to package the entity at the binary level. When we load the pickled object with a different Python version, it will throw an unexpected error. Therefore, we should restrict loading pickled objects with the mismatched Python version.

What changes were proposed in this pull request?

Let's assume the Python version for the pickled object is X.

  1. If we load it in a Python version smaller than X, it is possible to crash. Therefore, we need to handle loading exceptions.
  2. If we successfully load the object, we need to check whether the Python version is matched. I added a new metadata in the pickled object to store the Python version when pickling, and this will be used for validation after loading.
  3. Enable logging error messages when trying to extract the output from remote execution so that the user doesn't need to go to Console to get the error message.

How was this patch tested?

I set the Python version of Jupyter Notebook to 3.11 and built 3 images with different Python versions for remote execution, including 3.9, 3.11, and 3.12.

Setup process

Screenshots

  • Remote Python version = 3.12
    image
  • Remote Python version = 3.11
    image
  • Remote Python version = 3.9
    image

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Signed-off-by: Mecoli1219 <[email protected]>
@Mecoli1219
Copy link
Contributor Author

cc @thomasjpfan

Signed-off-by: Mecoli1219 <[email protected]>
Signed-off-by: Mecoli1219 <[email protected]>
Copy link

codecov bot commented Oct 22, 2024

Codecov Report

Attention: Patch coverage is 27.58621% with 21 lines in your changes missing coverage. Please review.

Project coverage is 38.49%. Comparing base (f2c4881) to head (bc53ba3).
Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
flytekit/core/python_auto_container.py 34.78% 15 Missing ⚠️
flytekit/remote/remote.py 0.00% 5 Missing ⚠️
flytekit/remote/executions.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2848      +/-   ##
==========================================
- Coverage   40.56%   38.49%   -2.07%     
==========================================
  Files         196      199       +3     
  Lines       20526    20781     +255     
  Branches     2642     2671      +29     
==========================================
- Hits         8326     8000     -326     
- Misses      12096    12567     +471     
- Partials      104      214     +110     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Mecoli1219 <[email protected]>
Signed-off-by: Mecoli1219 <[email protected]>
queue: typing.List[typing.Union[WorkflowBase, PythonTask, CoreNode]] = [root_entity]
pickled_target_dict = {}
pickled_target_dict = {
"metadata": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but this shares a namespace with task names right? so if someone makes a task where entity.name is "metadata" then this metadata will be lost and when we go to look for it we'll get a KeyError or something.

I don't think it needs to be here though, this is a pretty standard thing for all tasks. There's a couple places I think we can fit this information.

  • RuntimeMetadata's flavor field. I don't think this is being used right now. The name "runtime" is not correct. It's more like "compile-time", but that's a different problem.
  • Something in these tags. To my knowledge these tags aren't really being used right now.

I think i have a slight preference for the first one. what do you think @eapolinario?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure the first approach will be enough though. It is possible that we have to extend it to more fields (like maybe the package version?) in the future, and the first approach will have only 1 string that can store this value.

I think that this metadata is primarily used for interactive mode since only pickling objects will potentially cause the version mismatch problem. When creating a pickled object, the versions should follow with this object. I am thinking of moving the entities dictionary one layer down, so that user could have define function with name metadata

{
  "metadata": {
    "python_version": "3.12",
    ...
  },
  "entity_dict": {
    "task1": ...
    ...
  }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your call. I like the idea of adding the entity_dict key in general. I think we should do that anyways.

Your call on the metadata. The issue with just adding fields is that this risks becoming an undocumented API, so I was thinking we'd just use the existing flavor until we needed more than one field. If you go the pickled dict route though, just make sure it's well documented/commented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your concern. How about creating a dataclass to define the pickled object? In that way, people could know what is inside this object.

Regarding the metadata, the pickling only affects the Python side and doesn't need to be stored in the backend. The main issue this PR addresses is the Python version mismatch between pickled files and remote execution. By sending the Python version inside the pickled file, we can ensure the correctness of the version. Also as far as I researched, I think that runtimeMetadata is currently used for flytekit's language and version information rather than the Python version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good! just make sure using dataclass doesn't break the pickling.

@Mecoli1219
Copy link
Contributor Author

@wild-endeavor I push the dataclass version and test it with the task, workflow, map_task examples. My Python version for Jupyter Notebook is 3.12.0.

  • Cluster: 3.12.7
    image
  • Cluster: 3.11.0
    image

Signed-off-by: Mecoli1219 <[email protected]>
@eapolinario eapolinario enabled auto-merge (squash) October 30, 2024 19:34
@eapolinario eapolinario merged commit 5bf0acd into flyteorg:master Oct 30, 2024
28 of 29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants